28 research outputs found

    Multi-resolution two-sample comparison through the divide-merge Markov tree

    We introduce a probabilistic framework for two-sample comparison based on a nonparametric process taking the form of a Markov model that transitions between a "divide" and a "merge" state on a multi-resolution partition tree of the sample space. Multi-scale two-sample comparison is achieved by inferring the underlying state of the process along the partition tree. The Markov design allows the process to incorporate spatial clustering of differential structures, which is commonly observed in two-sample problems but ignored by existing methods. Inference is carried out under the Bayesian paradigm through recursive propagation algorithms. We demonstrate our method on simulated data and a real flow cytometry data set, and show that it substantially outperforms other state-of-the-art two-sample tests in several settings.

    Choosing a Proxy Metric from Past Experiments

    In many randomized experiments, the treatment effect of the long-term metric (i.e. the primary outcome of interest) is difficult or infeasible to measure. Such long-term metrics are often slow to react to changes and sufficiently noisy that they are challenging to estimate faithfully in short-horizon experiments. A common alternative is to measure several short-term proxy metrics in the hope that they closely track the long-term metric, so that they can be used to guide decision-making effectively in the near term. We introduce a new statistical framework to both define and construct an optimal proxy metric for use in a homogeneous population of randomized experiments. Our procedure first reduces the construction of an optimal proxy metric in a given experiment to a portfolio optimization problem which depends on the true latent treatment effects and noise level of the experiment under consideration. We then denoise the observed treatment effects of the long-term metric and a set of proxies in a historical corpus of randomized experiments to extract estimates of the latent treatment effects for use in the optimization problem. One key insight derived from our approach is that the optimal proxy metric for a given experiment is not fixed a priori; rather, it should depend on the sample size (or effective noise level) of the randomized experiment for which it is deployed. To instantiate and evaluate our framework, we apply our methodology to a large corpus of randomized experiments from an industrial recommendation system and construct proxy metrics that perform favorably relative to several baselines.

    The Role of Uric Acid in Acute and Chronic Coronary Syndromes.

    Uric acid (UA) is the final product of the catabolism of endogenous and exogenous purine nucleotides. While its association with articular gout and kidney disease has long been known, new data have demonstrated that UA is also related to cardiovascular (CV) diseases. UA has been identified as a significant determinant of many different outcomes, such as all-cause and CV mortality, and also of CV events (mainly Acute Coronary Syndromes (ACS) and even strokes). Furthermore, UA has been related to the development of Heart Failure, to higher mortality in decompensated patients, and to the onset of atrial fibrillation. After a brief introduction to the general role of UA in CV disorders, this review focuses on UA's relationship with CV outcomes, as well as on the specific features of patients with ACS and Chronic Coronary Syndrome. Finally, two open issues are discussed: the first is the identification of a CV UA cut-off value, and the second is whether pharmacological reduction of UA can lower the incidence of CV events.

    Mortality and pulmonary complications in patients undergoing surgery with perioperative SARS-CoV-2 infection: an international cohort study

    Background: The impact of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) on postoperative recovery needs to be understood to inform clinical decision making during and after the COVID-19 pandemic. This study reports 30-day mortality and pulmonary complication rates in patients with perioperative SARS-CoV-2 infection. Methods: This international, multicentre, cohort study at 235 hospitals in 24 countries included all patients undergoing surgery who had SARS-CoV-2 infection confirmed within 7 days before or 30 days after surgery. The primary outcome measure was 30-day postoperative mortality and was assessed in all enrolled patients. The main secondary outcome measure was pulmonary complications, defined as pneumonia, acute respiratory distress syndrome, or unexpected postoperative ventilation. Findings: This analysis includes 1128 patients who had surgery between Jan 1 and March 31, 2020, of whom 835 (74·0%) had emergency surgery and 280 (24·8%) had elective surgery. SARS-CoV-2 infection was confirmed preoperatively in 294 (26·1%) patients. 30-day mortality was 23·8% (268 of 1128). Pulmonary complications occurred in 577 (51·2%) of 1128 patients; 30-day mortality in these patients was 38·0% (219 of 577), accounting for 81·7% (219 of 268) of all deaths. In adjusted analyses, 30-day mortality was associated with male sex (odds ratio 1·75 [95% CI 1·28–2·40], p<0·0001), age 70 years or older versus younger than 70 years (2·30 [1·65–3·22], p<0·0001), American Society of Anesthesiologists grades 3–5 versus grades 1–2 (2·35 [1·57–3·53], p<0·0001), malignant versus benign or obstetric diagnosis (1·55 [1·01–2·39], p=0·046), emergency versus elective surgery (1·67 [1·06–2·63], p=0·026), and major versus minor surgery (1·52 [1·01–2·31], p=0·047). Interpretation: Postoperative pulmonary complications occur in half of patients with perioperative SARS-CoV-2 infection and are associated with high mortality. Thresholds for surgery during the COVID-19 pandemic should be higher than during normal practice, particularly in men aged 70 years and older. Consideration should be given for postponing non-urgent procedures and promoting non-operative treatment to delay or avoid the need for surgery. Funding: National Institute for Health Research (NIHR), Association of Coloproctology of Great Britain and Ireland, Bowel and Cancer Research, Bowel Disease Research Foundation, Association of Upper Gastrointestinal Surgeons, British Association of Surgical Oncology, British Gynaecological Cancer Society, European Society of Coloproctology, NIHR Academy, Sarcoma UK, Vascular Society for Great Britain and Ireland, and Yorkshire Cancer Research.

    Bayesian Methods for Two-Sample Comparison

    Two-sample comparison is a fundamental problem in statistics. Given two samples of data, the interest lies in understanding whether the two samples were generated by the same distribution or not. Traditional two-sample comparison methods are not suitable for modern data where the underlying distributions are multivariate and highly multi-modal, and the differences across the distributions are often locally concentrated. The focus of this thesis is to develop novel statistical methodology for two-sample comparison which is effective in such scenarios. Tools from the nonparametric Bayesian literature are used to flexibly describe the distributions. Additionally, the two-sample comparison problem is decomposed into a collection of local tests on individual parameters describing the distributions. This strategy not only yields high statistical power, but also allows one to identify the nature of the distributional difference. In many real-world applications, detecting the nature of the difference is as important as the existence of the difference itself. Generalizations to multi-sample comparison and more complex statistical problems, such as multi-way analysis of variance, are also discussed.

    Analysis of Distributional Variation Through Graphical Multi-Scale Beta-Binomial Models

    Many scientific studies involve comparing multiple datasets collected under different conditions to identify the difference in the underlying distributions. A common challenge in these multi-sample comparison problems is the presence of overdispersion, or extraneous causes other than the conditions of interest that also contribute to the cross-sample difference, which frequently results in false findings: identified “differences” that are not replicable in follow-up studies. When proper replicate samples are available under the conditions, one can in principle distinguish the interesting distributional variation from overdispersion through what we call the “analysis of distributional variation” (ANDOVA). We introduce a fully probabilistic framework for ANDOVA that achieves high computational efficiency. We take a divide-and-conquer multi-scale inference strategy: (i) first transform a general nonparametric ANDOVA task into a collection of ANDOVA tasks on Binomial experiments, each characterizing variations in the distributions at a particular location and scale, (ii) address each Binomial ANDOVA using a Beta-Binomial (BB) model, and (iii) use hierarchical graphical modeling to combine the inference from the BB models. We derive an efficient MCMC-free Bayesian inference recipe under this framework through a combination of Laplace approximation-based numerical integration and message passing, and evaluate the performance of our method through extensive simulation. We apply the framework to analyzing DNase-seq data for identifying differences in transcriptional factor binding. Supplementary material for this article is available online.